Dublin City University at the TweetMT 2015 Shared Task

Authors

  • Antonio Toral
  • Xiaofeng Wu
  • Tommi A. Pirinen
  • Zhengwei Qiu
  • Ergun Biçici
  • Jinhua Du
Abstract

We describe our participation in TweetMT for three language pairs in both directions: Spanish to and from Catalan, Basque and Portuguese. We used a range of techniques: statistical and rule-based MT, morph segmentation, data selection with ParFDA, and system combination. In terms of resources, we focused on crawling large amounts of tweets to perform monolingual domain adaptation. Our system was the best of all submitted systems in five of the six language directions.

Similar resources

EHU at TweetMT: Adapting MT Engines for Formal Tweets

This paper describes the participation of the IXA group from the UPV/EHU (University of the Basque Country) in the TweetMT shared task at the SEPLN-2015 conference. We have adapted existing MT engines for the es-eu and eu-es pairs, obtaining good results (better than other experiments reported in previous work). Three main aspects are described: resource compilation, engine adaptation and results.

Overview of TweetMT: A Shared Task on Machine Translation of Tweets at SEPLN 2015

This article presents an overview of the shared task that took place as part of the TweetMT workshop held at SEPLN 2015. The task consisted in translating collections of tweets from and to several languages. The article outlines the data collection and annotation process, the development and evaluation of the shared task, as well as the results achieved by the participants.

The DCU Discourse Parser: A Sense Classification Task

This paper describes the discourse parsing system developed at Dublin City University for participation in the CoNLL 2015 shared task. We participated in two tasks: a connective and argument identification task and a sense classification task. This paper focuses on the latter task and especially the sense classification for implicit connectives.

The UPC TweetMT participation: Translating Formal Tweets Using Context Information

In this paper, we describe the UPC systems that participated in the TweetMT shared task. We developed two main systems that were applied to the Spanish–Catalan language pair: a state-of-the-art phrase-based statistical machine translation system and a context-aware system. In the second approach, we define the “context” for a tweet as the tweets of a user produced in the same day, and also, we ...

Exploration of Feature Combination in Geo-visual Ranking for Visual Content-based Location Prediction

DAY 1: FIRST MORNING SESSION 9:00–9:15 Opening Brief words of welcome by Martha Larson and Gareth Jones 9:15–10:15 Search and Hyperlinking of Television Content Chair: Maria Eskevich (Dublin City University, Ireland) I. (20 min.) Search and Hyperlinking Task overview: The Search and Hyperlinking Task at MediaEval 2013 (presenter: Robin Aly, University of Twente, Netherlands) II. (10 min.) Linke...

Publication date: 2015